86 research outputs found

    Facial Action Unit Detection Using Attention and Relation Learning

    The attention mechanism has recently attracted increasing attention in the field of facial action unit (AU) detection. By finding the region of interest of each AU with the attention mechanism, AU-related local features can be captured. Most existing attention-based AU detection works use prior knowledge to predefine fixed attentions or refine the predefined attentions within a small range, which limits their capacity to model various AUs. In this paper, we propose an end-to-end deep learning based attention and relation learning framework for AU detection with only AU labels, which has not been explored before. In particular, multi-scale features shared by each AU are learned first, and then both channel-wise and spatial attentions are adaptively learned to select and extract AU-related local features. Moreover, pixel-level relations for AUs are further captured to refine spatial attentions so as to extract more relevant local features. Without changing the network architecture, our framework can be easily extended for AU intensity estimation. Extensive experiments show that our framework (i) soundly outperforms the state-of-the-art methods for both AU detection and AU intensity estimation on the challenging BP4D, DISFA, FERA 2015 and BP4D+ benchmarks, (ii) can adaptively capture the correlated regions of each AU, and (iii) also works well under severe occlusions and large poses. Comment: This paper is accepted by IEEE Transactions on Affective Computing.
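The combination of channel-wise and spatial attention described in this abstract can be illustrated with a minimal sketch. It is not the authors' network: the abstract does not specify the learned layers, so this version assumes a parameter-free stand-in in which each attention weight is just a sigmoid of a pooled average. The function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """One weight per channel, from the channel's global average.

    Assumption: a real model would pass the pooled value through learned
    layers; here the sigmoid is applied to the average directly.
    """
    return [
        sigmoid(sum(sum(row) for row in ch) / (len(ch) * len(ch[0])))
        for ch in feat
    ]


def spatial_attention(feat):
    """One weight per pixel, from the mean across channels at that pixel."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [
        [sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def apply_attention(feat):
    """Rescale a C x H x W feature map by both attention maps."""
    cw = channel_attention(feat)
    sw = spatial_attention(feat)
    h, w = len(sw), len(sw[0])
    return [
        [[feat[k][i][j] * cw[k] * sw[i][j] for j in range(w)] for i in range(h)]
        for k in range(len(feat))
    ]


# Toy feature map with 2 channels of size 2 x 2.
features = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
attended = apply_attention(features)
```

Each value is scaled by the weight of its channel and the weight of its pixel, so locations and channels with stronger responses are emphasized.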

    Fine-Grained Expression Manipulation via Structured Latent Space

    Fine-grained facial expression manipulation is a challenging problem, as fine-grained expression details are difficult to capture. Most existing expression manipulation methods resort to discrete expression labels, which mainly edit global expressions and ignore the manipulation of fine details. To tackle this limitation, we propose an end-to-end expression-guided generative adversarial network (EGGAN), which utilizes structured latent codes and continuous expression labels as input to generate images with expected expressions. Specifically, we adopt an adversarial autoencoder to map a source image into a structured latent space. Then, given the source latent code and the target expression label, we employ a conditional GAN to generate a new image with the target expression. Moreover, we introduce a perceptual loss and a multi-scale structural similarity loss to preserve identity and global shape during generation. Extensive experiments show that our method can manipulate fine-grained expressions and generate continuous intermediate expressions between source and target expressions.

    Spatio-Temporal Relation and Attention Learning for Facial Action Unit Detection

    Spatio-temporal relations among facial action units (AUs) convey significant information for AU detection yet have not been thoroughly exploited. The main reasons are the limited capability of current AU detection works in simultaneously learning spatial and temporal relations, and the lack of precise localization information for AU feature learning. To tackle these limitations, we propose a novel spatio-temporal relation and attention learning framework for AU detection. Specifically, we introduce a spatio-temporal graph convolutional network to capture both spatial and temporal relations from dynamic AUs, in which the AU relations are formulated as a spatio-temporal graph with adaptively learned instead of predefined edge weights. Moreover, the learning of spatio-temporal relations among AUs requires individual AU features. Considering the dynamism and shape irregularity of AUs, we propose an attention regularization method to adaptively learn regional attentions that capture highly relevant regions and suppress irrelevant regions so as to extract a complete feature for each AU. Extensive experiments show that our approach achieves substantial improvements over the state-of-the-art AU detection methods on the BP4D and especially the DISFA benchmarks.
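The idea of a graph with learned rather than predefined edge weights can be sketched as a single aggregation step. This is only an illustration under stated assumptions: edge weights are modeled as free logits normalized with a row-wise softmax, the temporal dimension and the layer's feature transform are omitted, and all names are hypothetical rather than taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of edge logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(edge_logits, features):
    """Mix node features using softmax-normalized edge weights.

    edge_logits[i][j] is a trainable score for the edge j -> i (assumed);
    features[j] is the feature vector of node (AU) j.
    """
    adjacency = [softmax(row) for row in edge_logits]
    n_nodes, n_feat = len(features), len(features[0])
    return [
        [
            sum(adjacency[i][j] * features[j][d] for j in range(n_nodes))
            for d in range(n_feat)
        ]
        for i in range(n_nodes)
    ]


# Two nodes with zero logits: every edge gets equal weight,
# so each output is the plain average of the node features.
logits = [[0.0, 0.0], [0.0, 0.0]]
node_features = [[1.0, 0.0], [3.0, 2.0]]
mixed = aggregate(logits, node_features)
```

During training the logits would be updated by gradient descent together with the rest of the network, which is what lets the graph structure adapt to the data.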

    Size dependent effectiveness of engineering and administrative control strategies for both short- and long-range airborne transmission control

    Ventilation is recognized as an effective mitigation strategy for long-range airborne transmission. However, a recent study by Li et al. revealed its potential impact on short-range airborne transmission as well. Our study extends their work by developing size-dependent transmission models for both short- and long-range airborne transmission and evaluates the impact of various control strategies, including ventilation. By adopting a recently determined mode-dependent viral load, we first analyzed the role of different sizes of droplets in airborne transmission. In contrast to models with a constant viral load where large droplets contain more viruses, our findings demonstrated that droplets ranging from ∼2–4 μm are more critical for short-range airborne transmission. Meanwhile, droplets in the ∼1–2 μm range play a significant role in long-range airborne transmission. Furthermore, our study indicates that implementing a size-dependent filtration/mask strategy considerably affects the rate of change (ROC) of virus concentration in relation to both distancing and ventilation. This underscores the importance of factoring in droplet size during risk assessment. Engineering controls, like ventilation and filtration, as well as administrative controls, such as distancing and masks, have different effectiveness in reducing virus concentration. Our findings indicate that high-efficiency masks can drastically reduce virus concentrations, potentially diminishing the impacts of other strategies. Given the size-dependent efficiency of filtration, ventilation has a more important role in reducing virus concentration than filtration, especially for long-range airborne transmission. For short-range airborne transmission, maintaining distance is far more effective than ventilation, and its effectiveness is largely unaffected by ventilation. However, the influence of ventilation on virus concentration and its variation with the distance mainly depend on the specific transmission model utilized. In sum, this research delineates the differential roles of droplet sizes and control strategies in both short- and long-range airborne transmission, offering valuable insights for future size-dependent airborne transmission control measures.
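The claim that ventilation outweighs filtration when filter efficiency depends on droplet size can be illustrated with a textbook steady-state mass balance for a well-mixed room. This is not the paper's model: it assumes perfect mixing, ignores deposition and viral inactivation, and uses hypothetical numbers and names throughout.

```python
def steady_state_concentration(emission, ventilation_flow, filter_flow,
                               filter_efficiency):
    """Steady-state concentration for one droplet size bin.

    Assumed balance: emission = (Q_vent + eta * Q_filter) * C, where eta is
    the size-dependent single-pass filter efficiency (0 to 1).
    """
    removal = ventilation_flow + filter_efficiency * filter_flow
    if removal <= 0:
        raise ValueError("total removal rate must be positive")
    return emission / removal


# Hypothetical values: emission 100 units/h, baseline ventilation 50 m^3/h.
baseline = steady_state_concentration(100.0, 50.0, 0.0, 0.0)

# Option A: add 50 m^3/h of extra ventilation.
more_ventilation = steady_state_concentration(100.0, 100.0, 0.0, 0.0)

# Option B: add 50 m^3/h of filtered recirculation at 50% efficiency
# for this droplet size.
more_filtration = steady_state_concentration(100.0, 50.0, 50.0, 0.5)
```

With these illustrative numbers, the same added airflow lowers the concentration more through ventilation than through a filter that captures only half of the droplets in that size range.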

    CT-Net: Arbitrary-Shaped Text Detection via Contour Transformer

    Contour-based scene text detection methods have rapidly developed recently, but still suffer from inaccurate frontend contour initialization, multi-stage error accumulation, or deficient local information aggregation. To tackle these limitations, we propose a novel arbitrary-shaped scene text detection framework named CT-Net by progressive contour regression with contour transformers. Specifically, we first employ a contour initialization module that generates coarse text contours without any post-processing. Then, we adopt contour refinement modules to adaptively refine text contours in an iterative manner, which are beneficial for context information capturing and progressive global contour deformation. Besides, we propose an adaptive training strategy to enable the contour transformers to learn more potential deformation paths, and introduce a re-score mechanism that can effectively suppress false positives. Extensive experiments are conducted on four challenging datasets, which demonstrate the accuracy and efficiency of our CT-Net over state-of-the-art methods. Particularly, CT-Net achieves an F-measure of 86.1 at 11.2 frames per second (FPS) and an F-measure of 87.8 at 10.1 FPS for the CTW1500 and Total-Text datasets, respectively. Comment: This paper has been accepted by IEEE Transactions on Circuits and Systems for Video Technology.
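For readers unfamiliar with the metric reported above, the F-measure is the harmonic mean of precision and recall. The sketch below shows the standard formula; the precision and recall values in the example are hypothetical and are not taken from the paper.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical detector: 90% precision, 60% recall.
score = f_measure(0.9, 0.6)
```

Because the harmonic mean is pulled toward the smaller of the two values, a detector cannot reach a high F-measure by maximizing precision or recall alone.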

    IterativePFN: True Iterative Point Cloud Filtering

    The quality of point clouds is often limited by noise introduced during their capture process. Consequently, a fundamental 3D vision task is the removal of noise, known as point cloud filtering or denoising. State-of-the-art learning based methods focus on training neural networks to infer filtered displacements and directly shift noisy points onto the underlying clean surfaces. In high noise conditions, they iterate the filtering process. However, this iterative filtering is only done at test time and is less effective at ensuring points converge quickly onto the clean surfaces. We propose IterativePFN (iterative point cloud filtering network), which consists of multiple IterationModules that model the true iterative filtering process internally, within a single network. We train our IterativePFN network using a novel loss function that utilizes an adaptive ground truth target at each iteration to capture the relationship between intermediate filtering results during training. This ensures that the filtered results converge faster to the clean surfaces. Our method is able to obtain better performance compared to state-of-the-art methods. The source code can be found at: https://github.com/ddsediri/IterativePFN. Comment: This paper has been accepted to the IEEE/CVF CVPR Conference, 202

    Exploring core mental health symptoms among persons living with HIV: A network analysis

    Context: Persons living with HIV (PLWH) commonly experience mental health symptoms. However, little is known about the core mental health symptoms and their relationships. Objective: This study aimed to evaluate the prevalence of various mental health symptoms and to explore their relationships in symptom networks among PLWH. Methods: From April to July 2022, we recruited 518 participants through convenience sampling in Beijing, China, for this cross-sectional study. Forty mental health symptoms, including six dimensions (somatization symptoms, negative affect, cognitive function, interpersonal communication, cognitive processes, and social adaptation), were assessed through paper-based or online questionnaires. Network analysis was performed in Python 3.6.0 to explore the core mental health symptoms and describe the relationships among symptoms and clusters. Results: Of the 40 mental health symptoms, the most common symptoms were fatigue (71.2%), trouble remembering things (65.6%), and uncertainty about the future (64.0%). In the single symptom network, sadness was the most central symptom across the three centrality indices (rS = 0.59, rC = 0.61, rB = 0.06), followed by feeling discouraged about the future (rS = 0.51, rC = 0.57, rB = 0.04) and feelings of worthlessness (rS = 0.54, rC = 0.53, rB = 0.05). In the symptom cluster network, negative affect was the most central symptom cluster across the three centrality indices (rS = 1, rC = 1, rB = 0.43). Conclusion: Our study provides a new perspective on the role of each mental health symptom among PLWH. To alleviate the mental health symptoms of PLWH to the greatest extent possible and comprehensively improve their mental health, we suggest that psychological professionals pay more attention to pessimistic mood and cognitive processes in PLWH. Interventions that apply positive psychology skills and cognitive behavioral therapy may be necessary components for the mental health care of PLWH.
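The abstract does not define its three centrality indices. Assuming rS denotes strength centrality, as is common in symptom network analysis, the index can be sketched as the sum of absolute edge weights attached to each node. The node names and edge weights below are hypothetical placeholders, not data from the study, and the function is an illustration rather than the authors' code.

```python
from collections import defaultdict


def strength_centrality(edges):
    """Sum of absolute edge weights per node in an undirected network.

    edges is a list of (node_a, node_b, weight) tuples; negative weights
    (inverse associations) count by their magnitude.
    """
    strength = defaultdict(float)
    for a, b, weight in edges:
        strength[a] += abs(weight)
        strength[b] += abs(weight)
    return dict(strength)


# Hypothetical three-symptom network with placeholder weights.
example_edges = [
    ("s1", "s2", 0.25),
    ("s1", "s3", 0.5),
    ("s2", "s3", -0.125),
]
strengths = strength_centrality(example_edges)
most_central = max(strengths, key=strengths.get)
```

A symptom with high strength is directly and strongly connected to many others, which is why it is often treated as a candidate intervention target.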